# Hybrid Mamba-Transformer

Mambavision L3 256 21K
Other
The first hybrid computer vision model to combine the strengths of Mamba and Transformers. It redesigns the Mamba formulation for more efficient visual feature modeling and adds self-attention blocks in the final layers of the architecture to better capture long-range spatial dependencies.
Image Classification Transformers
nvidia
510
7
Mambavision L2 512 21K
Other
The first hybrid computer vision model to combine the advantages of Mamba and Transformers, redesigning the Mamba formulation to improve visual feature modeling.
Image Classification Transformers
nvidia
2,678
3
Mambavision L 21K
Other
MambaVision is a hybrid Mamba-Transformer vision backbone that combines the Mamba formulation with Vision Transformer blocks for strong performance in image classification and downstream vision tasks.
Image Classification Transformers
nvidia
571
4
Mambavision B 21K
Other
The first hybrid computer vision model to combine the strengths of Mamba and Transformers. It redesigns the Mamba formulation for more efficient visual feature modeling and adds self-attention blocks at the end of the architecture to better capture long-range spatial dependencies.
Image Classification Transformers
nvidia
1,395
4
Vamba Qwen2 VL 7B
MIT
Vamba is a hybrid Mamba-Transformer architecture that achieves efficient long video understanding through cross-attention layers and Mamba-2 modules.
Video-to-Text Transformers
TIGER-Lab
806
16
Mambavision T2 1K
Other
The first hybrid computer vision model to combine the strengths of Mamba and Transformers. It redesigns the Mamba formulation to improve visual feature modeling and incorporates self-attention blocks into the architecture to better capture long-range spatial dependencies.
Image Classification Transformers
nvidia
597
4
Mambavision T 1K
Other
MambaVision is the first hybrid computer vision model to combine the advantages of Mamba and Transformers. A redesigned Mamba formulation and integrated ViT blocks improve its modeling of long-range spatial dependencies.
Image Classification Transformers
nvidia
2,323
31